Predicting Edit Locations on Wikipedia using Revision History

Authors

  • Caitlin Colgrove
  • Julie Tibshirani
  • Remington Wong
Abstract

There has been increasing interest in the machine learning community in automatic task design. In a collaborative problem-solving setting, how can we best break up and assign tasks so as to optimize output? Huang et al., for example, considered the problem of effectively assigning image-labeling tasks to workers on Amazon Mechanical Turk [1]. In the realm of Wikipedia prediction, Cosley et al. created a successful system for recommending articles a user might be interested in editing [2]. Nicknamed SuggestBot, the system uses co-editing patterns and the articles a user has previously edited to predict new articles of interest. It represents a success in automatic task design: deploying SuggestBot increased users' edit rates fourfold on average.

Similar resources

Wikipedia Revision Toolkit: Efficiently Accessing Wikipedia's Edit History

We present an open-source toolkit that makes it possible (i) to reconstruct past states of Wikipedia, and (ii) to efficiently access the edit history of Wikipedia articles. Reconstructing past states of Wikipedia is a prerequisite for reproducing previous experimental work based on Wikipedia. Beyond that, the edit history of Wikipedia articles has been shown to be a valuable knowledge source for NLP, but...

Wikipedia Revision Graph Extraction Based on N-Gram Cover

During the past decade, mass collaboration systems have emerged and thrived on the World Wide Web, generating a large volume of user content. As one such system, Wikipedia allows users to add and edit articles in its encyclopedic knowledge base, and a vast number of revisions have been contributed. Wikipedia maintains a linear, timestamped record of the edit history for each article, which includes precio...

WHAD: Wikipedia historical attributes data - Historical structured data extraction and vandalism detection from the Wikipedia edit history

This paper describes the generation of temporally anchored infobox attribute data from the Wikipedia history of revisions. By mining (attribute, value) pairs from the revision history of the English Wikipedia we are able to collect a comprehensive knowledge base that contains data on how attributes change over time. When dealing with the Wikipedia edit history, vandalic and erroneous edits are ...

Using Language Models to Detect Wikipedia Vandalism

This paper explores a statistical language modeling approach for detecting Wikipedia vandalism. Wikipedia is a popular and influential collaborative information system. The collaborative nature of authoring, together with the high visibility of its content, has exposed Wikipedia articles to vandalism, defined as malicious editing intended to compromise the integrity of the content of articles. Ex...

Web-based Validation for Contextual Targeted Paraphrasing

In this work, we present a scenario where contextual targeted paraphrasing of sub-sentential phrases is performed automatically to support the task of text revision. Candidate paraphrases are obtained from a preexisting repertoire and validated in the context of the original sentence using information derived from the Web. We report on experiments on French, where the original sentences to be r...

Publication date: 2011